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ABSTRACT 

Recent success in explaining several properties of the dusty torus around the central 
engine of active galactic nuclei has been gathered with the assumption of dumpiness. 
The properties of such clumpy dusty tori can be inferred by analyzing spectral energy 
distributions (SEDs), sometimes with scarce sampling given that large aperture tele- 
scopes and long integration times are needed to get good spatial resolution and signal. 
We aim at using the information already present in the data and the assumption of 
clumpy dusty torus, in particular, the CLUMPY models of Nenkova et al., to evaluate 
the optimum next observation such that we maximize the constraining power of the 
new observed photometric point. To this end, we use the existing and barely applied 
idea of Bayesian adaptive exploration, a mixture of Bayesian inference, prediction and 
decision theories. The result is that the new photometric filter to use is the one that 
maximizes the expected utility, which we approximate with the entropy of the predic- 
tive distribution. In other words, we have to sample where there is larger variability in 
the SEDs compatible with the data with what we know of the model parameters. We 
show that Bayesian adaptive exploration can be used to suggest new observations, and 
ultimately optimal filter sets, to better constrain the parameters of the clumpy dusty 
torus models. In general, we find that the region between 10 and 200 yum produces the 
largest increase in the expected utility, although sub-mm data from ALMA also prove 
to be useful. It is important to note that here we are not considering the angular res- 
olution of the data, which is key when constraining torus parameters. Therefore, the 
expected utilities derived from this methodology must be weighted with the spatial 
resolution of the data. 

Key words: methods: data analysis, statistical — galaxies: active, nuclei, Seyfert 



1 INTRODUCTION 



The unified model fo r active galaxies (|Antonuccil 1 19931 : 
lUrrv fc Padovanill 19951) is based on the existence of a dusty 
toroidal structure surrounding the central region of active 
galactic nuclei (AGN). Considering this geometry of the ob- 
scuring material, the central engines of Type-1 AGN can be 
seen directly, resulting in typical spectra with both narrow 
and broad emission lines, whereas in Type-2 AGN the broad 
line region (BLR) is obscured. 

The infrared (IR) range (and particularly the mid- 
infrared; MIR) is key to characterize the torus, since the 
dust reprocesses the optical and ultraviolet radiation gener- 
ated in the accretion process and re-emits it in this range. 
However, considering the small torus siz^E high angular res- 



olution turns to be essential to isolate as much as possible 
its emission. 

Pioneering wo r k in modelling the dusty torus 
(|Pier fc Kroliklll992l , Il993h assumed a uniform dust density 
distri bution to simplify the mo delling, although from the 
start. [Krolik fc Begelmanl (1 19881 ) realized that smooth dust 
distributions cannot survive within the immediate AGN 
vicinity. To solve the various discrepancies between obser- 
vations and models, an intensive search for an alternative 
torus geometry has been c arried out in the las t decad e. 
The clumpy torus models dNenkova et all |2002| , l2008al lbl: 
iHonig et alj|2006l ; ISchartmann et al.ll2008h "propose that the 
dust is distributed in clumps, instead of homogeneously 
filling the torus volume. These models are making sig- 
nificant progress in accounting for the MIR emission of 
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Less than 10 pc in the c ase of S eyfert galaxies based on ground- 
based MIR imaging (e.g. iPackham et all 12005: Radoms ki et alj 



200S| and interferom etric observations (e.g. Ijaffe et all |2004| ; 



Tristram et~al] [2007) 
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AGN dMason et al. 2006, 


2009; Mor et al. 2009; Horst et al. 


2008, 2009i; Nikutta et al. 


2009; Ramos Almeida et al.l 


2009, 


2011a||b|: iHonie et all l2010l: lAlonso-Herrero et al.l 
2012 allbh. 


2011, 



In previous works we constructed subarcsecond resolu- 
tion IR spectral energy distributions (SEDs) for about 20 
nearby Seyfert galaxies and successfully reproduced them 
with the clumpy torus models of Nenkova et al. (here- 
after CLUMPY). It is worth mentioning, however, that 
some observational results show that the CLUMPY mod- 
els alone cannot explain the IR SEDs of a sa mple of PC 
quasars l|Mor et al.ll2009l : iMor fc Netzerl l2012h . The latter 
authors needed an additional hot dust component to re- 
produce the SEDs. Mo r eover, recent inter ferometry results 
( Kish imoto et al ] 120111 ; iHonig et all I2012T ) indicate that a 
single component torus does not reproduce the observations. 

The CLUMPY database now contains ~5xl0 6 mod- 
els, calculated for a fine grid of model parameters. The 
inherent degeneracy between these parameters has to be 
taken into account when fitting the observables. To this end, 
we developed the Bayesian inference tool BayesCLUMPY. 
Details on the interpola tion methods and algorithms em- 
ployed can be found in lAsensio Ramos fc Ramos Almeida 
(2009). We point out that, given the specificities of the 
Bayesian inference approach we follow, in the following anal- 
ys is we will not be using t he original set of models described 
in iNenkova et a l. (2008a b), but an interpolated version of 
them. 

In [Ramos Almeida et al.1 (|2009l . bOllbl ) we fitted IR 
SEDs constructed using MIR nuclear fluxes obtained with 8 
m telescopes and NIR measu rements of similar resolutio n 
from the literature (see also lAlonso-Herrero et al.1 l201ll ) . 
Some of the SEDs were well-sampled (e.g. the Circinus 
galaxy and Centaurus A) whilst others comprised just three 
photometric data points (e.g. NGC 1365 and NGC 1386). 
The better the sampling of t he SEP, the more constraine d 
the torus parameters (see e.g. lAlonso-Herrero et al. I l2012al ). 
Considering the need for 8-10 m telescopes to isolate the 
torus emission, it is necessary to determine the minimum 
number of filters required to constrain the model parame- 
ters. We utilize the output of our code BayesCLUMPY in a 
Bayesian experiment design framework. Our aim is to design 
the experiment (observation of a source using a selected fil- 
ter) that introduces more constraints for the parameters of 
the CLUMPY models. Using our Bayesian approach, we can 
investigate which and how many optical, IR, and sub-mm 
filters restrict the most the parameter space, as well as which 
wavelengths provide more information about each of the pa- 
rameters. Although here we present results for the models 
of Nenkova et al., the formalism can be applied to any other 
set of models, including multi-component ones. Thus, this 
work can be useful for the community when applying for 
telescope observations. 



CLUMPY DUSTY TORUS MODELS AND 
BAYESIAN APPROACH 



The CLUMPY models of INenkova et al] (|2002T ) hold that 
the dust surrounding the central engine of an AGN is 
distributed in clumps. These clumps are distributed with 
a radial extent Y — R /Rd, where R and Rd are the 
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Figure 1 . Scheme of the clumpy torus described in lNenkova et al] 
(2008a,ij). The radial extent of the torus is defined by the outer 
radius (R ) and the dust sublimation radius (Rd)- AH the clouds 
are supposed to have the same Ty , and <x characterizes the width 
of the angular distribution. The number of cloud encounters is 
function of the viewing angle, i. 



Table 1. Clumpy Model Parameters and Considered Intervals 



Parameter 


Abbreviation 


Interval 


Width of the angular 






distribution of clouds 


a 


[15°, 75°] 


Radial extent of the torus 


Y 


[5, 100] 


Number of clouds along 






the radial equatorial direction 


N 


[1, 15] 


Power-law index of the 






radial density profile 


1 


[0, 3] 


Inclination angle of the torus 


i 


[0°, 90°] 


Optical depth per single cloud 


Ty- 


[5, 150] 



outer and inner radius of the toroidal distribution, re- 
spectively (see Figure [TJ. The inner radius is defined by 
the dust sublimation temperature (T au b ~ 1500 K), with 
R d = 0.4 (1500 K T s ; 1 6 ) 2 - 6 (L/10 45 erg s" 1 ) ' 5 pc. Within 
this geometry, each clump has the same optical depth (tv). 
The average number of clouds along a radial equatorial ray 



is No- The radial density profile is a power-law (oc r~ q ). 
A width parameter, <r, characterizes the angular distribu- 
tion of the clouds, which has a smooth edge. The num- 
ber of clouds along the LOS at an inclination angle i is 
N L os(i) = N e (-('-9 Q) 2 /^ 2 ), For a detailed descri ption of 
the clumpy models see INenkova et al.1 (|2002l . l2008a| jbh. 

In every Bayesian scheme, one needs to spec- 
ify a-priori information about the model parame- 
ters. This is done through the prior distribution (see 
lAsensio Ramos fc Ramos Almeidal 120091 . for more details). 
We consider them to be truncated uniform distributions 
for the six model parameters in the intervals reported in 
Table [TJ Therefore, we give the same weight to all the 
values in each interval. To compare with the observations, 
BayesCLUMPY simulates the effect of the employed filters 
on the SED by integrating the product of the synthetic 
SED and the filter transmission curve. Observational errors 
are assumed to be Gaussian. 

The results of the fitting process of the IR SEDs with 
the CLUMPY models are the posterior distributions for the 



six free parameters that describe the models. When the ob- 
served data introduce sufficient information into the fit, the 
resulting probability distributions will clearly differ from the 
input uniform distributions, either showing trends or being 
centered at certain values within the intervals considered. 



3 BAYESIAN ADAPTIVE EXPLORATION 

In this section we briefly describe our approach to the exper- 
iment design, following the lines of th e theory of Bayesian 
adapt ive e xploration, as developed by Seb astiani fc Wvrml 
( 2000) and ILoredol |2004). The output of this analysis is a 
measure of the "utility" of each one of the available pho- 
tometric filters to constrain the models introduced in Sec. 

m 



3.1 General problem 

The problem at hand can be stated as follows. Let D be 
an TV-dimensional vector containing observations that sam- 
ple an AGN SED using N different filters. A fundamental 
assumption that we have to make (inherent to any infer- 
ence process) is to consider that the SEDs we are analyz- 
ing can be correctly reproduced with the CLUMPY mod- 
els. If we have a set of M new filters available (in our 
case, from the list of Table [2j to carry out new observa- 
tions, we want to know which is the filter that, when we 
acquire a new observation with it, maximizes the amount 
of information that we gain about the model parameters. 
As stated before, this is a crucial problem to be solved 
when applying for observing time in large-aperture tele- 
scopes, given the large overpetition of such telescopes. In 
order to solve this problem, we apply a recent methodol- 
ogy of Bayesian experiment design ter med Bayesian adap- 
tive ex plora tion (BAE), as presented bv lSebastiani fc Wvnnl 
|2000l ) and ILoredol l |2004h . In essence, BAE can be un- 
derstood as an iteration of an observation-inference- design 
scheme. Starting from the current observations, we in- 
fer the model parameters using standard Bayesian tech- 
niques like what can be achieved wit h BayesCLUMPY 
|Asensio Ramos &: Ramos Almeida! l2009h . The inference 
process allows us to predict new data at the light of the infor- 
mation gained from the observations. From these synthetic 
observations, it is possible to predict which new experiments 
will carry more information. The decision of which is the best 
experiment to carry out is not an issue of Bayesian inference, 
but one has to rely o n the foundations of Bayesian decision 
theory (|Bergerl Il985l ). This procedure can be iterated un- 
til convergence, although we will not pursue this issue here, 
only the suggestion of a new observations. We summarize in 
the following the tools that we have used in this paper. 

Let be the vector with the parameters that define 
the CLUMPY model. Once the set of observations D for 
the current filters have been acquired, the role of Bayesian 
inference is to compute the posterior distribution p(0|D, I) 
that encodes all the information we possess about the model 
parameters at the light of the observations and assuming our 
current experiment I. Using the posterior distribution, it is 
possible to predict which is the expected value of a fictitiuos 
new observation o at filter / under the experiment If using 
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Table 2. Filters considered in this work 



ID 


Instrument 


Telescope 


Aq [/Am] 


FWHM [;im] 


Ff Of W 


WFPC2 


HST 


603 


22 


F791 W 


WFPC2 


HST 


o snr 


i ■"■ 


F 1 1 W 


NICMOS 


HST 


1 102 


r (i 
" ''. 


F 1 (">( ) VV 


NICMOS 




1 rnr 




F 1 87N 


NICMOS 


HST 


18 "4 


H I 2 


j ,"■■>■>■> \[ 


NICMOS 




2 216 


111 






VLT 




Or 

' 




rc \ f ( i 






' ■■ r 


s 






n 1 on 


.. .... 




ma r-n 








; 










05 


1, 


VLT 


4 r 6 


,' -. ' 




J n 




4 781 






ISAAC 


VLT 


1 10 


If 




ISAAC 


VLT 


I f - o 


'"'(I 




ISAAC 


VLT 


2 164 


o'o" 


if 


ISAAC 


VLT 


■i 




M 


ISAAC 


VLT 


1 ' f r 7 


10 








i o in 






■ n 






n in 
■ ' 1 




son 


NTT 










i r K I R T 








IR C AM3 


UKIRT 


1 630 






IRC AM3 


I J K I R T 


''■'()() 






I R C A M 3 


UKIRT 








IRC AM3 


UKIRT 


4*758 


' '' 






2 2m ESO 


' 'm 






IRAC-1 


2 2m ESO 


2 161 


9^ 


K , 






2 113 


i 1 


j. 


NSFCam 


IRTF 


3 498 


r 1 




TIMMI2 


i ft-m pen 


2' 'if 1 







TIMMI2 


3 6 m ESO 


~, ' 1 n 5 


n ~ s 




TIMMI2 


■j . o m nou 




,"'. ,L , 


1 


MICHELLE 


Gemini N 


9 1^0 






M I C H E L L E 


emini 


' '-'() 


- r ^ 




MICHELLE 




10 " r> 


101 


N' 


MICHELLE 


Gemini N 


11 30 


■"' 10 


gig 


MICHELLE 


Gemini N 


12 72 


1 16 


Qa 


MICHELLE 


Gemini N 


18.47 


1.95 


Q 


MICHELLE 


Gemini N 


20.86 


8.97 


Si2 


T-ReCS 


Gemini S 


8.725 


0.78 


N 


T-ReCS 


Gemini S 


10.31 


5.24 


Np 


T-ReCS 


Gemini S 


11.35 


2.27 


Si5 


T-ReCS 


Gemini S 


11.65 


1.16 


Qa 


T-ReCS 


Gemini S 


18.34 


1.52 


N 


OSCIR 


CTIO 


10.82 


5.16 


IHW18 


OSCIR 


CTIO 


18.12 


1.61 


Arlll 


VISIR 


VLT 


8.996 


0.13 


SIV 


VISIR 


VLT 


10.46 


0.10 


PAH 2 


VISIR 


VLT 


11.27 


0.59 


PAH2.2 


VISIR 


VLT 


11.74 


0.37 


NeII_l 


VISIR 


VLT 


12.27 


0.19 


NeII_2 


VISIR 


VLT 


13.04 


0.22 


Q2 


VISIR 


VLT 


18.75 


0.86 


CC_Si2 


CanariCam 


GTC 


8.67 


1.04 


CC_N 


CanariCam 


GTC 


10.31 


5.25 


CC_Q4 


CanariCam 


GTC 


20.39 


0.97 


CC_Q8 


CanariCam 


GTC 


24.53 


0.75 


LWC_31.5 


FORCAST 


SOFIA 


31.38 


4.54 


LWC_33.6 


FORCAST 


SOFIA 


33.60 


1.56 


LWC_34.S 


FORCAST 


SOFIA 


34.81 


3.42 


LWC_37.1 


FORCAST 


SOFIA 


37.18 


2.13 


PACS70 


PACS 


Herschel 


71.07 


20.9 


PACS100 


PACS 


Herschel 


102.2 


36.0 


PACS160 


PACS 


Herschel 


165.9 


74.5 


SPIRE250 


SPIRE 


Herschel 


251.6 


77.9 


SPIRE350 


SPIRE 


Herschel 


353.3 


106.6 


SPIRE500 


SPIRE 


Herschel 


508.8 


198.2 


ALMA Ch9 




ALMA 


460.0 


77.3 


ALMA Ch7 




ALMA 


945.0 


280.2 



the well-known predictive distribution (e.g.. lBishod l2006): 
p(o\T>,I f ) = J d0p{o,6\D,I f ) 

= J dO P (o\e,T>,i f )p(o\r>,i f ), (1) 

where the conditioning on D of p(o\0,D, I f) is, in fact, ir- 
relevant once is known. The term p(o\0, D, //) is just the 
likelihood of getting the new observation o. According to the 
Bayesian decision theory, one should choose the filter / that 
maximizes the expected utility (EU), defined as: 

EU(f) = J dop(o\D,I f )U(o,f), (2) 

where the function U (o, /) is the utility, a function that is 
at the core of decision theory and that quantifies the infor- 
mation gain of a new experiment and can also include the 
potential cost of the new experiment at filter / (i.e., we could 
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potentially include factors that, e.g., increase the weight of 
filters that provide a better spatial resolution or that takes 
into account the difficulty of being awarded with observing 
with a certain telescope). In other words, we have to choose 
the filter that maximizes the value of the expected utility 
using the predictions of the flux at filter / according to our 
current knowledge of the model. 

The previous definitions are somewhat obvious and the 
core of Bayesian decision theory is located on the appro- 
priate definiti on of the utilit y function. We follow he re the 
suggestion of iLindlevI |l956f) (see also iLoredd [2004) , who 
suggested that, since we want to maximize the information 
about the model parameters 9, it makes sense to use the 
information encoded on the posterior for 0, as measured by 
the negative differential entropy: 



U(oJ) = -H[e\o,T),I f ], 



(3) 



where the notation means that the entropy is computed for 
the posterior for the distribution p(0\o, D, If). The differen- 
tial entropy is, following the standard definition, given by: 

H[9\o, D, //] = - J d0p(9\o, D, I f ) logp(0|o, D, /,). (4) 

Consequently, we have to compute the following quantity for 
all the available filters 

EU(/) = J J dod0p(o\-D,If) P (0\o,-D,If)lag P (0\o,T>,If),(5) 

and choose the filter / that maximizes it. One could plug 
the expression for the predictive distribution of Eq. fT} on 
the expected utility and end up with a triple multidimen- 
sional integral that needs to be computed for each value of 
/. However, it turns out to be much easier to plug Eqs. (ESl 
and ( B onto Eq. © and apply Bayes theorem fe.g.. lGregorvl 
2005) to logp(0|o, D, //), so that we have to end up with: 



EU(/) = / do dOp(0\T>, If) p(o\0,T>, If) logp(o\e,T>,I f ) 
+ [ dodOp(o\0,-D,If)p(0\-D,If)lag P (0\T>,If) 
dodOp(G\T>, I f )p(o\H, If) logp(o|D, //). (6) 



Using the definition of entropy of Eq. Q, we can rewrite 
the previous expression as: 



EHf) = - / d6p(0\D, If)H[o\6, D, If] 
dop(o\0,T>,If)H[0\T>,If] 
+ / de P (0\D,I f )H[o\I),If]. (7) 



Note that the entropies H[9\T), I f] and H[o\T),If] can be 
extracted from the integrals because they do not depend on 
the integration variable. Given that the probability distribu- 
tions are normalized to unit area, the expression simplifies 
to: 



EU(/) = -J d0p(9\H,I f )H[o\e,-D,If] 

- H[e\D,I f ]+H[o\D,I f ]. (8) 



3.2 Simplifications 

Clearly, the second term (the entropy of the posterior distri- 
bution considering the existing data) is independent of the 
election of the new filter, so it becomes constant with re- 
spect to / and can be dropped from the computation. The 
term H[o\0, D, If], which is the entropy of the likelihood for 
the new observed point at filter /, can be assumed to be 
constant if the expected noise is independent of the mea- 
surement. This is the case when the measurement for the 
new filter / is done under the presence of additive (Gaus- 
sian) noise with a variance that is independent of /. How- 
ever, this is not usually the case since the noise variance at 
different wavelengths can vary a lot. Assuming that the ob- 
servation o is perturbed with Gaussian noise with variance 
af, the likelihood p(o\9, D, If) is a Gaussian distribution, so 
its entropy is analytically expressed as: 

H[o\0,T>, / / ] = ilog(27re<r?). (9) 

Given that the previous entropy is independent of 9, it can 
be extracted from the integral in Eq. ([8} and use the fact 
that the posterior is normalized to unit area. Therefore, the 
final expression for the expected utility that we use in this 
work simplifies to: 



EU(/) 



i log (2irecT 2 f)-J dop(o\D, I f ) logp(o|D, J», (10) 

with p(o|D, If) given by Eq. |T]). Note that if the noise vari- 
ance does not dep end on /, we end up with the e xpected 
utility derived by ISebastiani fc Wvnnl (|2000h and ILoredd 
( 2004). They noted that the previous expression leads to a 
maximum entropy sampling, so that we select the new filter 
where we know the least. In other words, we should sample 
where the current values of the model parameters allow for 
a large variability of the SED. 



3.3 Technicalities 

The com putation of th e integral of Eq. (|f 0[) can be done 
following lLoredol (120041 ). who used a Monte Carlo estimation 
technique. First, we can compute a Monte Carlo estimation 
ofp(o|D,7/) using: 



p(o\T>,If 



AW 



AW ^ 

2=1 



p(o\9j,T),If), 



(11) 



where the 9j are A pos samples from the posterior distribu- 
tion p(9\D,If) that have been previously computed with 
BayesCLUMPY for the current set of observations. Then, the 
integral of Eq. {TO}, the entropy of the predictive distribu- 
tion, is computed using another Monte Carlo estimation: 



-prod 

H[o\T>,If] « — - V logpfelD,/, 

JVp rc d * ' 



(12) 



3=1 



where the Oj are samples from p(o|D, //). However, we have 
tested that better results are obtained if a simple trapezoidal 
quadrature is used to compute the integral of Eq. (|10[) . To 
this end, we discretize the posible range of variation of o in 
bins. For each bin, we compute p(o\T),If) using the Monte 
Carlo estimation of Eq. (|f 1[) . At the end, we compute the 
entropy of the predictive distribution as: 
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Table 3. Measured flux densities 



Filters NGC 1566 NGC 4151 Circinus NGC 1068 avg Syl avg Sy2 



NICMOS F110W 


- 


60±6 


- 


9.8±2 


- 


- 


NICMOS F160W 


- 


100±10 


1.6±0.2 


98±15 


0.05±0.01 


O.OOliO.001 


NICMOS F222M 




197±20 




445±100 








1 1 -t-n 1 




A 8-1-0 7 




u.u / nzu.uo 


U.UUOICU.UUZ 


NACO Ks 


2.1±0.1 




19±2 




0.12±0.05 


0.023±0.018 


NACO 2.42 






31±3 








NACO L' 


7.8±0.1 




380±38 




0.35±0.11 


0.15±0.09 


NACO M' 






1900±190 




0.48±0.13 


0.31±0.09 


NSFCam L 




325±65 










IRCAM3 M 




449±34 




2270±341 






TIMMI2 L 








920±138 






OSCIR N 




1320±200 










OSCIR IHW18 




3200±800 










T-ReCS Si2 


29±4 




5620±843 


10000±1500 


1.00±0.15 


1.00±0.15 


T-ReCS Qa 


63±9 




12791±3198 


21773±5443 


4.46±2.19 


4.36±1.93 


VISIR PAH2.2 


117±29 












VISIR Q2 


128±32 













Fluxes in m Jy from iRamos Almeida et al.l J2009I . l2011bh and lAlonso-Herrero et al.l d201 if) . In the case of 



the Syl and Sy2 templates, the fluxes have been normalized to the central wavelength of the T-ReCS Si2 
filter. 




1 10 100 1000 1 10 100 1000 




1 10 100 1000 1 10 100 1000 




1 10 100 1000 1 10 100 1000 

Wavelength [jim] 



Figure 2. IR SEDs used in this paper. The circles with the error bars are the observed points and they are drawn at the central 
wavelength of each filter. The left panels correspond to Syl galaxies, while the right panels are Sy2. The average Syl and Sy2 SEDs have 
been obtained by a simple mean of galaxies belonging to each group once their SEDs have been interpolated onto the filters in which 
Circinus was observed (REF). Fluxes are normalized to the data point closer to 10 fira. The grey curves are the CLUMPY SEDs that 
have been sampled from the posterior. 
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#[°I D > If] w ^2i W iP{°j I D j If) logpfelD, //), (13) 



with Wj the standard trapezoidal weights (e.g.. IPress et all 
1 1986ft . 



4 RESULTS 

The Bayesian Adaptive Exploration theory, as summarized 
in the previous section, is applied to the case of SEDs 
generated by the emission of a clumpy torus around an 
A GN. Under the assum ption that the parametric models 
of iNenkova et all (|2008al fbT) are representative of the under- 
lying physics, our analysis allows us to search for the filter 
that introduces more restrictions on the model parameters. 

Our set of potential filters is listed in Table [2] Our aim 
is to consider a sufficiently complete list of photometric fil- 
ters available in many medium- to large-aperture telescopes. 
In particular, we are considering all the optical, NIR, MIR 
and far-IR (FIR) filters that we have employed in our pre- 
vious work, plus four new MIR filters from the CanariCam 
instrument, which has recently started to operate at the 10 
m Gran Telescopio Canarias (GTC), and another four fil- 
ters from the Long Wavelength Camera (LWC) on the FOR- 
CAST instrument. FORCAST is a MIR/FIR camera for the 
Stratospheric Obs ervatory For Infrared Astronomy (SOFIA; 
I Young et all 12012} ) . Finally, to inspect whether or not sub- 
mm data can set any constraints on the model parameters, 
we have also added channels 7 and 9 of ALMA, although 
they cannot be considered filters per se. They are simulated 
as a square transmission efficiency with a peak of 80% in 
the ranges [800, 1090] fim for channel 7 and [420, 500] /an 
for channel 9. Table [2] presents the filter identification, to- 
gether with the instrument and telescope in which they are 
mounted. Additionally, we have tabulated the central wave- 
length of the filter, computed as Ao = J XR(\)dX/ J R(X)dX, 
where i?(A) is the transmission efficiency of the filter, and 
the full- width at half maximum (FWHM). The transmission 
efficiency of the filters have been obtained from the litera- 
ture. 

It is important to note that, for the sake of simplic- 
ity, in this work we are not taking into account the angular 
resolution provided by the different instruments considered 
here. In other words, we are assuming that all instruments 
listed in Table [2] are equivalent except in the wavelength 
coverage. However, the reader must keep in mind that high 
spatial resolution is mandatory when trying to isolate the 
torus emission. Thus, aside from the expected utilities de- 
rived here for the different instruments, it is important to 
consider the resolution when deciding which should be the 
next observation. 

To demonstrate our approach, we have used four SEDs, 
two representative of Seyfert 1 (Syl) and another two rep- 
resentative of Sey fert 2 (Sy2) galaxies. These SEDs have 
been discussed by Ramos Almeida et alj (|2009l . l2011bl ) and 
lAlonso-Herrero et al.l 1 201 if ) using a deep Bayesian analysis. 
Additionally, here we use the Syl and Sy2 avera ge tem- 
plates presented in iRamos Almeida et~aiT (|2011bl L These 
mean templates were constructed using individual Syl and 
Sy2 SEDs of angular resolution <0.55 arcsec, that were first 
interpolated onto the Circinus wavelength grid (1.265, 1.60, 



2.18, 3.80, 4.80, 8.74, and 18.3 fim), and then normalized to 
the central wavelength of the T-ReCS Si2 filter (8.74 fxm). 
These flux densities and their associated errors are shown in 
Tableland are displayed in graphical form in Fig.[2l normal- 
ized to the filter that is closer to 10 ^tm. Note that the SEDs 
that we have selected have quite dense samplings, as opposed 
to oth er sparsely sampled ones (|Ramos Almeida et al.|[2009l . 
l2011bh . 

For each SED we have carried out a full Bayesian analy- 
sis using BayesCLUMPY. This analysis relies on sampling the 
posterior distribution function as a function of the model 
parameters (shown in Table Q]). We have used flat priors for 
all model parameters. For the Syl galaxies, we have addi- 
tionally included an AGN spectrum, which is added to the 
CLUMPY SED. The shape of the AGN spectrum is the same 
that is co nsidered for the infra red pumping of the dust in 
the torus |Nenkova e t al. 2008a|). Likewise, we also take into 
account a certain amount of extinction following the law of 
IChiar et al.l (|2000h . This introduces the free parameter Av, 
which is included into the inference for Syl galaxies with a 
flat prior in the range [0, 5] mag. The inclusion of these two 
ingredients turns out to be essential for obtaining reliable 
results for Syl sources. 

BayesCLUMPY generates samples of model parameters 
distributed according to the posterior distribution. Figure [2] 
shows, apart from the observed points, a representation of 
the SEDs reconstructed from the model parameters sampled 
from the posterior. Note that the sampled SEDs have a small 
variability close to the observed points with the smaller error 
bars. In principle, a nd according to the maxim um entropy 
sampl ing derived bvlSebastia ni fc Wvnnl |2000l ) and lLoredd 
(2004), the filter that we should select has to be located close 
to the wavelength where the SEDs contain larger variabil- 
ity with the observational information we have acquired. An 
evaluation of Eq. (|10|) . following the ideas of Sec. 13.31 gives 
the results displayed in the left column of Fig. This figure 
represents the value of EU(/) for filter / versus the cen- 
tral wavelength of the filter. The noise associated to any 
new observation, 07, depends on the wavelength: we con- 
sider 10% of the value of the flux at A ^ 5 fim, 20% at 
5 /im < A < 25 /im and 30% at A ^ 25 /jm. These are the 
typical uncertainties that we m easured for individual galax- 
ies in ou r previous work (e.g.. Ramos Almeida et al J 12009 . 



l2011al lbl; lAlonso-Herrero et alj|2011 



The expected utility is usually measured in terms of 
"nats" (or bits), i.e., units of information. However, since 
we are not interested in their actual value but in relative 
measures, we normalize the plots to the maximum. Several 
properties are evident. First, the lack of a vertical spread on 
the plots for a given wavelength means that the expected 
utility of the filters depends essentially on the wavelength. 
Using different filters with the same central wavelength but 
different transmission properties produce the same increase 
in the information acquired. In other words, the results are 
not dependent on the exact shape of the filter transmission 
curve. Obviously, other reasons (atmospheric transmission, 
telescope diameter, availability, etc.) would make one partic- 
ular filter preferable among a set of equivalent filters. How- 
ever, under the same conditions, they introduce roughly the 
same information for constraining CLUMPY models. 

In general, the region between 10 and 200 /im produces 
the largest increase in the expected utility. Out of the filters 
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Figure 3. Expected utility (EU) computed using Eq. JHJ and normalized to the maximum of each panel. EU is calculated for all filters 
not present in the observations. The first column refers to the case in which we use all the available observations of the galaxy. The central 
and right columns refer to the cases in which we only keep observations below and above 4 fim, respectively. For an easier visualization, 
the different wavelength ranges have been color-coded. 

Black dots correspond to filters with Ao ^ 6 fim, red dots to 6 < Ao ^ 25 (im, blue dots to 25 < Ao ^ 200 (im and green dots to 

A > 200 fim. 



considered here, those from SOFIA provide the largest con- 
straining power for the CLUMPY models. However, the poor 
spatial resolution of SOFIA (~3-4 arcsec) represents a clear 
drawback and it might be preferable to choose other filters 
with slightly smaller expected utility but better spatial reso- 
lution. The filters in the Q-band, especially the CanariCam 



Q8 filter (Ao =24.5 /mi), are the next most useful ones for 
constraining the clumpy torus model parameters, specially 
if we consider their good spatial resolution (~0.55 arcsec). 
They are closely followed by filters in the N-band (~8-12 
fim) and by the Herschel Space Observatory PACS filters, 
especially PACS70 and PACS100. The previous results are 



8 A. Asensio Ramos & C. Ramos Almeida 




not surprising considering that the bulk of torus emission is 
observed in the MIR, peaking at ~20 /im. The PACS obser- 
vations, which cover the spectral range from 70 to 160 /im, 
are sampling cooler dust within the torus, and thus helping 
to constrain parameters such as the torus radial extent (V) 
and the radial distribution of the clouds (q), as discussed in 
iRamos Almeida etlrtl (|2011al ). 

According to our results, data points obtained with the 
SPIRE instrument on Herschel -despite its low spatial res- 
olution, which we are neglecting here- still have a relatively 
large expected utility, and the same happens with sub-mm 
data from ALMA. We have considered two ALMA channels 
(7 and 9) and the expected utility of channel 9 (Ao = 460 
/xm) is comparable to that provided by the NIR data in the 
case of the Sy2 galaxies. This result is encouraging for poten- 
tial ALMA users looking for constraints on torus properties. 
ALMA will observe in the sub-mm regime with unprece- 
dented angular resolution, and according to the analysis pre- 
sented here, data obtained with channel 9 can provide useful 
constraints on torus model parameters. The case of Herschel 
SPIRE and ALMA clearly illustrates what we discussed at 
the beginning of this section: although the expected utili- 
ties measured for the SPIRE filters are slightly higher than 
those for ALMA channel 9, the angular resolution provided 
by ALMA makes it a better choice than SPIRE to sample 
the cool dust within the torus. On the other hand, the ex- 
pected utility of channel 7 (Ao = 945 /tm) is negligible in all 
the cases considered. 

It is interesting to note that there is no apparent differ- 
ence between Syl and Sy2 galaxies regarding the proposed 
new niters at wavelengths >4 /im. Therefore, it seems that 
spectral features that are supposed to serve to constrain the 
model parameters because they are generally different in Syl 
and Sy2 (such as the silicate feature at 10 /jm, generally in 
emission in Syl and in absorption in Sy2) do not make any 
difference in suggesting new observations. It is worth noting, 
though, that here we are using photometry only, which can 



provide just an insight of spectral features such as the sili- 
cate band. The use of high angular resolution spectroscopy 
in this spectral region (provided by ground-based instru- 
ments such as VISIR, MICHELLE and CanariCam) can set 
constraints on the clumpy torus models th at the photometry 
alone cannot l|Alonso-Herrero et al.ll201lh . Thus, the use of 
spectroscopy would possibly change the suggested observa- 
tions. Unfortunately, the expression for the expected utility 
of Eq. (|10|) in the spectroscopic case requires the compu- 
tation of a multidimensional integral because the expected 
observation o transforms into the vector o. Tailor-made al- 
gorithms which are computationally heavy have to be ap- 
plied to compute this integral. For this reason, we defer the 
spectroscopic case to a later study. 

At wavelengths <4 /jm we start to see a difference be- 
tween the expected utilities of Syl and Sy2. Optical and NIR 
observations help constraining the torus parameters for Syl 
(expected utilities between 0.4 and 0.8) more than for Sy2 
(expected utilities ~0.2 around 1 /im). The hot dust emis- 
sion from the directly illuminated faces of the clumps close 
to the central engine and the direct AGN emission strongly 
flatten the Syl IR SEDs (see Figure [2}. Thus, in the case of 
Syl, more observations at short wavelengths are required to 
constrain the shape of the SED in this region, that in turn 
constrains parameters such as the inclination angle of the 
torus (i). 

As an experiment, we have simulated which is the in- 
fluence of having a reduced set of observations. To this aim, 
we have considered the observed points below and above 4 
/im from the complete SEDs. This separation was chosen to 
roughly separate the NIR and the MIR keeping a similar 
number of points in both spectral domains. The resulting 
SEDs are shown in the left and right six panels of Fig. [4] re- 
spectively, using the same normalization as in Fig. [2] Again, 
the grey curves are samples of SEDs using the model pa- 
rameters from the Bayesian inference. Despite the sparser 
sampling of the SEDs in the two cases, the results remain 



Constraining clumpy dusty torus models 9 




20 40 60 80 



Figure 5. Sim ulation of a Bayesian adaptive exploration experiment. A new observed point is added in each column from those available 
for NGC 3081 (Ramos Almeida ct al. 201 laj)- The first column shows the observations with their associated error bars, together with 
the SEDs obtained from the posterior distribution in each case. The second column shows the expected utility displaying the next best 
filter. The next four columns represent the marginal posteriors of the CLUMPY parameters ct, Y, Nq and i, respectively. 



essentially similar (see central and right panels of Fig [3}, 
with the expected utility peaking in the 10-200 ^im region. 
The only difference in the results is a small increase in the 
expected utitility of the optical and NIR data when only 
A >4 fim data is considered in the Sy2 SEDs (bottom right 
panels in Figure [3]). 



5 SIMULATED ITERATED BAE FOR NGC 
3081 

The process we have described until this point is obvi- 
ously not complete. As already described in Sec. [3] the 
Bayesian adaptive exploration scheme starts from observa- 
tions, and then applies Bayesian inference tools to design 
a new potentially informative observation. Since the typ- 
ical timescale of this process can be very large given the 
necessity to apply for observing time in large-aperture tele- 
scopes, we have carried out a simulated full process in Fig. 
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[U We have used the SEP of the Sy2 ga laxy NGC 3081 
presented in iRamos Almeida et"afl (|2011al l. The reason for 
choosing this galaxy is the availability of Herschel PACS 
nuclear fluxes. As a first step, we have assumed that we 
only have the observation in the NICMOS F160W filter and 
let the Bayesian adaptive exploration scheme select the fol- 
lowing filters. BayesCLUMPY then uses this point to sam- 
ple the posterior distribution and obtain marginal posteriors 
for the clumpy model parameters. The observed point, to- 
gether with the SEDs sampled from the posterior are shown 
in the first column of Fig. [5] The second column shows 
the expected utility, while the remaining panels display the 
marginal posterior for a, Y , No and i, respectively. 

In order to simulate the next experiment, we have se- 
lected the next observation from the set of observed points 
of the full SED that has the largest expected utility, which 
turns out to be the T-ReCS Qa data point (Ao = 18.3 /im). 
The full process is then repeated until having all the ob- 
served points of the SED in the lowest panel. The order in 
which the remaining filters have been selected is: PACS70, 
PACS100, PACS160, VISIR NeII_2 (Ao = 13.04 /im) and 
finally, T-ReCS Si2 (Ao = 8.72 /im). Given that the dif- 
ferences in expected utility are quite small and can change 
a little because of the inherently random character of the 
Montecarlo sampling done with BayesCLUMPY the specific 
order might change from run to run. In fact, this means that 
one could choose a filter giving a large expected utility even 
though it is not the one giving the largest value if observing 
with this filter is easier for some specific reason. 

It is important to note how the marginal posteriors 
are constrained when augmenting the number of observed 
points. Once there are two points in the NIR and MIR, the 
marginal posterior of a is very similar to what we find when 
considering the full SED. This is not the case for Y and No, 
that considerably change after introducing the FIR data and 
the N-band data points (NeII_2 and Si2 filters). Finally, the 
FIR does not seem to make any difference to the inclination 
angle of the torus, i, but the addition of the N-band data 
points -especially the Si2 filter- result in a different poste- 
rior shape, favouring intermediate torus inclinations. 



6 CONCLUSIONS 

The CLUMPY models have recently gained much success 
on explaining the observed IR SED of the inner parsecs 
of nearby Seyfert galaxies. Given the observational diffi- 
culty (one needs to obtain good signal-to-noise IR pho- 
tometry of sources at subarcsecond resolution, and ideally, 
spectroscopy), the sampling of the SEDs is generally quite 
scarce. Assuming that the CLUMPY models are success- 
ful in explaining the observed SED, we have applied a 
Bayesian adaptive exploration scheme to propose new ob- 
servations given the presently available information about a 
source. The results quantitatively indicate that the region 
between 10 and 200 /im is crucial, almost independently of 
the presently available observed points. We also find that 
optical and NIR data have higher expected utilities in the 
case of Syl than in Sy2. Finally, data from 200 to 500 /im 
are found to be useful, even if not as much as data in the 10- 
200 /im range, for constraining the torus model parameters. 
It is important to note that, for simplicity, here we are not 



considering the angular resolution provided by the different 
instruments. Thus, aside from the measured expected utility 
of a given filter, it is important to consider the resolution of 
the data when deciding the next observation. 

We performed a BAE experiment for the galaxy NGC 
3081 and found that having two data points only -one in the 
NIR and another in the MIR- significantly constrains the the 
torus width (a), which does not change when adding further 
IR data points. The addition of FIR data helps constraining 
the torus radial extent (Y) and the number of clouds (No). 
Finally, the combination of the J, Si2 and Q filters appears 
to be the most suitable to constrain the inclination angle of 
the torus (i). Note, however, that these are the results found 
for a particular galaxy, and they may not be applicable to 
all Seyferts. 

Given the large pressure on large-aperture telescopes, 
our approach -when combined with spatial resolution 
considerations- is very promising for constraining CLUMPY 
models, and possibly torus models in general, in sparsely 
sampled SEDs with the smallest number of observed points. 
The Bayesian adaptive exploration scheme can be applied 
to any parametric model (e.g., smooth torus, light curves of 
classical eclipsing binaries, x-ray binaries) once the tools to 
carry out a fully Bayesian inference are available. The ap- 
plication to the CLUMPY models is straightforward thanks 
to the BayesCLUMPY tool. We note that the computation of 
the expected utility is already included in the public version 
of BayesCLUMPY, which is available for the community. 
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